statistical error
An Adaptive Algorithm for Learning with Unknown Distribution Drift Alessio Mazzetto Brown University Eli Upfal Brown University
We develop and analyze a general technique for learning with an unknown distribution drift. Given a sequence of independent observations from the last T steps of a drifting distribution, our algorithm agnostically learns a family of functions with respect to the current distribution at time T. Unlike previous work, our technique does not require prior knowledge about the magnitude of the drift.
Divergence FrontiersforGenerativeModels: SampleComplexity, QuantizationEffects, andFrontierIntegrals
The spectacular success ofdeep generativemodels calls forquantitativetools to measure their statistical performance. Divergence frontiers have recently been proposed as an evaluation framework for generative models, due to their ability to measure the quality-diversity trade-off inherent to deep generative modeling. We establish non-asymptotic bounds on the sample complexity of divergence frontiers.
Structural Properties, Cycloid Trajectories and Non-Asymptotic Guarantees of EM Algorithm for Mixed Linear Regression
Luo, Zhankun, Hashemi, Abolfazl
This work investigates the structural properties, cycloid trajectories, and non-asymptotic convergence guarantees of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR) with unknown mixing weights and regression parameters. Recent studies have established global convergence for 2MLR with known balanced weights and super-linear convergence in noiseless and high signal-to-noise ratio (SNR) regimes. However, the theoretical behavior of EM in the fully unknown setting remains unclear, with its trajectory and convergence order not yet fully characterized. We derive explicit EM update expressions for 2MLR with unknown mixing weights and regression parameters across all SNR regimes and analyze their structural properties and cycloid trajectories. In the noiseless case, we prove that the trajectory of the regression parameters in EM iterations traces a cycloid by establishing a recurrence relation for the sub-optimality angle, while in high SNR regimes we quantify its discrepancy from the cycloid trajectory. The trajectory-based analysis reveals the order of convergence: linear when the EM estimate is nearly orthogonal to the ground truth, and quadratic when the angle between the estimate and ground truth is small at the population level. Our analysis establishes non-asymptotic guarantees by sharpening bounds on statistical errors between finite-sample and population EM updates, relating EM's statistical accuracy to the sub-optimality angle, and proving convergence with arbitrary initialization at the finite-sample level. This work provides a novel trajectory-based framework for analyzing EM in Mixed Linear Regression.
Neural Triangular Transport Maps: A New Approach Towards Sampling in Lattice QCD
Bryutkin, Andrey, Marzouk, Youssef
Lattice field theories are fundamental testbeds for computational physics; yet, sampling their Boltzmann distributions remains challenging due to multimodality and long-range correlations. While normalizing flows offer a promising alternative, their application to large lattices is often constrained by prohibitive memory requirements and the challenge of maintaining sufficient model expressivity. We propose sparse triangular transport maps that explicitly exploit the conditional independence structure of the lattice graph under periodic boundary conditions using monotone rectified neural networks (MRNN). We introduce a comprehensive framework for triangular transport maps that navigates the fundamental trade-off between \emph{exact sparsity} (respecting marginal conditional independence in the target distribution) and \emph{approximate sparsity} (computational tractability without fill-ins). Restricting each triangular map component to a local past enables site-wise parallel evaluation and linear time complexity in lattice size $N$, while preserving the expressive, invertible structure. Using $ϕ^4$ in two dimensions as a controlled setting, we analyze how node labelings (orderings) affect the sparsity and performance of triangular maps. We compare against Hybrid Monte Carlo (HMC) and established flow approaches (RealNVP).